21 research outputs found

    Where and Who? Automatic Semantic-Aware Person Composition

    Image compositing is a method for generating realistic yet fake imagery by inserting content from one image into another. Previous work on compositing has focused on improving the appearance compatibility of a user-selected foreground segment and a background image (i.e., color and illumination consistency). In this work, we instead develop a fully automated compositing model that additionally learns to select and transform compatible foreground segments from a large collection, given only a background image as input. To simplify the task, we restrict our problem to human instance composition, because human segments exhibit strong correlations with their background and because large annotated datasets are available. We develop a novel branching Convolutional Neural Network (CNN) that jointly predicts candidate person locations given a background image. We then use pre-trained deep feature representations to retrieve person instances from a large segment database. Experimental results show that our model can generate composite images that look visually convincing. We also develop a user interface to demonstrate a potential application of our method. Comment: 10 pages, 9 figures
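    The retrieval step mentioned above (matching candidate person segments to a background via pre-trained deep features) could be sketched roughly as below. This is a minimal illustration, not the authors' pipeline; the ResNet-50 backbone, the file paths, and the helper names (embed, rank_candidates) are assumptions made for the example.

    # Minimal sketch (not the paper's code): rank candidate person segments by
    # cosine similarity of pre-trained deep features to a background query crop.
    import torch
    import torch.nn.functional as F
    from torchvision import models, transforms
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Pre-trained backbone used only as a fixed feature extractor.
    backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    backbone.fc = torch.nn.Identity()          # keep the 2048-d pooled feature
    backbone.eval().to(device)

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    @torch.no_grad()
    def embed(path: str) -> torch.Tensor:
        """Return an L2-normalized feature vector for one image."""
        x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
        return F.normalize(backbone(x), dim=1).squeeze(0)

    def rank_candidates(query_crop: str, candidate_paths: list[str], k: int = 5):
        """Rank candidate segment images by similarity to the query crop."""
        q = embed(query_crop)
        sims = [(p, float(q @ embed(p))) for p in candidate_paths]
        return sorted(sims, key=lambda t: t[1], reverse=True)[:k]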

    Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning

    In this paper we revisit the idea of pseudo-labeling in the context of semi-supervised learning, where a learning algorithm has access to a small set of labeled samples and a large set of unlabeled samples. Pseudo-labeling works by assigning pseudo-labels to samples in the unlabeled set using a model trained on the combination of the labeled samples and any previously pseudo-labeled samples, and by iteratively repeating this process in a self-training cycle. Current methods seem to have abandoned this approach in favor of consistency regularization methods that train models under a combination of different styles of self-supervised losses on the unlabeled samples and standard supervised losses on the labeled samples. We empirically demonstrate that pseudo-labeling can in fact be competitive with the state of the art while being more resilient to out-of-distribution samples in the unlabeled set. We identify two key factors that allow pseudo-labeling to achieve such remarkable results: (1) applying curriculum learning principles and (2) avoiding concept drift by restarting model parameters before each self-training cycle. We obtain 94.91% accuracy on CIFAR-10 using only 4,000 labeled samples, and 68.87% top-1 accuracy on ImageNet-ILSVRC using only 10% of the labeled samples. The code is available at https://github.com/uvavision/Curriculum-Labeling. Comment: In the 35th AAAI Conference on Artificial Intelligence, AAAI 2021
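    The self-training cycle described above (curriculum-style confidence thresholding plus parameter restarts) could be sketched as follows. This is a toy illustration on synthetic data, not the released Curriculum-Labeling code; the scikit-learn classifier and the percentile schedule are assumptions.

    # Minimal sketch of curriculum pseudo-labeling: each cycle re-initializes the
    # model (avoiding concept drift), trains on labeled + currently pseudo-labeled
    # samples, then admits the most confident unlabeled predictions using a
    # percentile threshold that is relaxed over cycles.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import make_classification

    # Toy data standing in for a real semi-supervised dataset.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    labeled = np.zeros(len(X), dtype=bool)
    labeled[:100] = True                      # small labeled set

    X_l, y_l = X[labeled], y[labeled]
    X_u = X[~labeled]

    pseudo_X, pseudo_y = np.empty((0, X.shape[1])), np.empty(0, dtype=int)
    n_cycles = 5
    for cycle in range(n_cycles):
        # Restart model parameters every cycle (fresh estimator instance).
        model = LogisticRegression(max_iter=1000)
        model.fit(np.vstack([X_l, pseudo_X]), np.concatenate([y_l, pseudo_y]))

        probs = model.predict_proba(X_u)
        conf = probs.max(axis=1)

        # Curriculum: first keep only the top ~20% most confident predictions,
        # then lower the percentile threshold as the cycles progress.
        percentile = 80 - cycle * (80 / n_cycles)
        keep = conf >= np.percentile(conf, percentile)

        pseudo_X, pseudo_y = X_u[keep], probs[keep].argmax(axis=1)
        print(f"cycle {cycle}: {keep.sum()} pseudo-labeled samples")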

    EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers

    Self-attention based models such as vision transformers (ViTs) have emerged as a very competitive architectural alternative to convolutional neural networks (CNNs) in computer vision. Despite increasingly stronger variants with ever-higher recognition accuracies, existing ViTs are typically demanding in computation and model size due to the quadratic complexity of self-attention. Although several successful design choices of prior CNNs (e.g., convolutions and a hierarchical multi-stage structure) have been reintroduced into recent ViTs, they are still not sufficient to meet the tight resource constraints of mobile devices. This motivates a very recent attempt to develop light ViTs based on the state-of-the-art MobileNet-v2, but a performance gap remains. In this work, pushing further along this under-studied direction, we introduce EdgeViTs, a new family of light-weight ViTs that, for the first time, enable attention-based vision models to compete with the best light-weight CNNs in the trade-off between accuracy and on-device efficiency. This is realized by introducing a highly cost-effective local-global-local (LGL) information exchange bottleneck based on an optimal integration of self-attention and convolutions. For device-dedicated evaluation, rather than relying on inaccurate proxies like the number of FLOPs or parameters, we adopt a practical approach of focusing directly on on-device latency and, for the first time, energy efficiency. Specifically, we show that our models are Pareto-optimal when both accuracy-latency and accuracy-energy trade-offs are considered, achieving strict dominance over other ViTs in almost all cases and competing with the most efficient CNNs. Code is available at https://github.com/saic-fi/edgevit. Comment: Accepted at ECCV 2022
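    A rough sketch of what a local-global-local style block could look like is given below. This is not the official EdgeViT implementation; the specific layers (depthwise convolution for local aggregation, average pooling for token subsampling, transposed convolution for local propagation) and the sampling rate r are assumptions made for illustration.

    # Minimal sketch of an LGL-style block: aggregate locally with a depthwise
    # conv, run self-attention only on a sparse grid of delegate tokens, then
    # propagate the result back locally and add a residual connection.
    import torch
    import torch.nn as nn

    class LGLBlock(nn.Module):
        def __init__(self, dim: int, num_heads: int = 4, r: int = 4):
            super().__init__()
            self.local_agg = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depthwise
            self.sample = nn.AvgPool2d(kernel_size=r, stride=r)             # delegate tokens
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)
            self.propagate = nn.ConvTranspose2d(dim, dim, kernel_size=r, stride=r,
                                                groups=dim)                 # local propagation

        def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, C, H, W)
            b, c, h, w = x.shape
            y = self.local_agg(x)                              # local aggregation
            t = self.sample(y)                                 # (B, C, H/r, W/r)
            hs, ws = t.shape[2], t.shape[3]
            tokens = self.norm(t.flatten(2).transpose(1, 2))   # (B, N, C)
            tokens, _ = self.attn(tokens, tokens, tokens)      # sparse global attention
            t = tokens.transpose(1, 2).reshape(b, c, hs, ws)
            return x + self.propagate(t)                       # residual connection

    x = torch.randn(1, 64, 32, 32)
    print(LGLBlock(64)(x).shape)   # torch.Size([1, 64, 32, 32])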

    The 5th International Conference on Biomedical Engineering and Biotechnology (ICBEB 2016)


    FaceCollage: a rapidly deployable system for real-time head reconstruction for on-the-go 3D telepresence

    This paper presents FaceCollage, a robust, real-time head-reconstruction system for building easy-to-deploy telepresence setups from a pair of consumer-grade RGBD cameras that together provide a wide range of views of the reconstructed user. A key feature is that the system can be deployed rapidly, with autonomous calibration and minimal intervention from the user beyond casually placing the cameras. The system is realized through three technical contributions: (1) a fully automatic calibration method that analyzes and correlates the left and right RGBD faces using only facial features; (2) an implementation that exploits the parallel computation capability of the GPU throughout most of the pipeline in order to attain real-time performance; and (3) a complete integrated system on which we conducted various experiments to demonstrate its capability, robustness, and performance, including tests on twelve participants with visually pleasing results. Funding: NRF (National Research Foundation, Singapore)
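    To illustrate the flavor of feature-based calibration between two RGBD views, the sketch below estimates a rigid transform from corresponding 3D facial landmarks with the standard Kabsch/Procrustes method. This is an illustrative stand-in, not the FaceCollage calibration procedure; the landmark correspondences are assumed to be given.

    # Minimal sketch: recover the rigid transform between two camera views from
    # matched 3D facial landmarks (Kabsch algorithm via SVD).
    import numpy as np

    def rigid_transform(src: np.ndarray, dst: np.ndarray):
        """Least-squares rigid transform (R, t) with R @ src_i + t ~= dst_i.
        src, dst: (N, 3) arrays of corresponding 3D landmark positions."""
        src_c, dst_c = src.mean(0), dst.mean(0)
        H = (src - src_c).T @ (dst - dst_c)            # 3x3 covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))         # avoid reflections
        R = Vt.T @ np.diag([1, 1, d]) @ U.T
        t = dst_c - R @ src_c
        return R, t

    # Synthetic check: recover a known rotation/translation from noisy landmarks.
    rng = np.random.default_rng(0)
    P = rng.normal(size=(10, 3))
    R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
    Q = P @ R_true.T + np.array([0.1, 0.2, 0.3]) + rng.normal(scale=1e-3, size=(10, 3))
    R, t = rigid_transform(P, Q)
    print(np.allclose(R, R_true, atol=1e-2))           # True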

    High-quality Kinect depth filtering for real-time 3D telepresence

    3D telepresence is a next-generation multimedia application that offers remote users an immersive, natural video-conferencing environment with real-time 3D graphics. The Kinect sensor, a consumer-grade range camera, facilitates the implementation of several recent 3D telepresence systems. However, conventional data filtering methods are insufficient for handling Kinect depth error, because this error is quantized rather than merely randomly distributed. Hence, one often observes large, irregularly-shaped patches of pixels that receive the same depth value from Kinect. To enhance visual quality in 3D telepresence, we propose a novel depth-data filtering method for Kinect based on multi-scale and direction-aware support windows. In addition, we develop a GPU-based CUDA implementation that performs depth filtering in real time. Experimental results show that our method can reconstruct hole-free surfaces that are smoother and less bumpy than those produced by existing methods such as bilateral filtering. © 2013 IEEE
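    A much-simplified illustration of direction-aware, multi-scale support windows is sketched below: for each pixel, several directional half-windows at several scales are considered, and the most self-consistent one supplies the filtered value. This is not the paper's filter (which is GPU-based and tailored to Kinect's quantized error model); the window shapes, scales, and variance criterion here are assumptions.

    # Toy direction-aware, multi-scale depth filter: pick, per pixel, the
    # directional half-window with the lowest variance among its valid depths,
    # so smoothing does not cross depth discontinuities and holes get filled
    # from the consistent side.
    import numpy as np

    def filter_depth(depth: np.ndarray, scales=(2, 4, 8)) -> np.ndarray:
        h, w = depth.shape
        out = depth.astype(float).copy()
        for y in range(h):
            for x in range(w):
                best_var, best_val = np.inf, out[y, x]
                for s in scales:
                    # Four directional half-windows around (y, x).
                    windows = [
                        depth[max(y - s, 0):y + 1, max(x - s, 0):x + s + 1],  # up
                        depth[y:y + s + 1, max(x - s, 0):x + s + 1],          # down
                        depth[max(y - s, 0):y + s + 1, max(x - s, 0):x + 1],  # left
                        depth[max(y - s, 0):y + s + 1, x:x + s + 1],          # right
                    ]
                    for win in windows:
                        vals = win[win > 0]            # 0 marks missing Kinect depth
                        if vals.size < 4:
                            continue
                        v = vals.var()
                        if v < best_var:
                            best_var, best_val = v, vals.mean()
                out[y, x] = best_val
        return out

    # Tiny smoke test on synthetic quantized depth with a hole.
    d = np.full((32, 32), 800, dtype=np.int32)
    d[:, 16:] = 1200                 # depth discontinuity
    d[10:14, 10:14] = 0              # missing measurements
    print(filter_depth(d)[12, 12])   # filled from the consistent (800 mm) side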

    Field-Guided Registration for Feature-Conforming Shape Composition

    Figure 1 (caption): Composing parts, possibly with sharp features and non-overlapping boundaries, presents challenges to both part alignment and blending. Our field-guided approach (see middle for a visualization of the fields) leads to alignment of parts away from each other and feature-conforming surface blending. The bridging surfaces generated (colored yellow on the right) are piecewise smooth.

    We present an automatic shape composition method to fuse two shape parts which may not overlap and may contain sharp features, a scenario often encountered when modeling man-made objects. At the core of our method is a novel field-guided approach to automatically align two input parts in a feature-conforming manner. The key to our field-guided shape registration is a natural continuation of one part into the ambient field as a means to introduce an overlap with the distant part, which then allows a surface-to-field registration. The ambient vector field we compute is feature-conforming; it characterizes a piecewise smooth field which respects and naturally extrapolates the surface features. Once the two parts are aligned, gap filling is carried out by spline interpolation between matching feature curves, followed by piecewise smooth least-squares surface reconstruction. We apply our algorithm to obtain feature-conforming shape composition on a variety of models and demonstrate the generality of the method with results on parts with or without overlap and with or without salient features
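    The idea of surface-to-field registration can be illustrated with a toy example: sample one part's distance field on a grid, then optimize a rigid transform of the other part's points so they fall onto that field's zero set. This is only a conceptual sketch; the paper uses a feature-conforming ambient vector field rather than a plain distance field, and the toy shapes, grid resolution, and optimizer below are assumptions.

    # Toy surface-to-field registration: part A is represented by a sampled
    # distance field; a rigid transform of part B's points is optimized so the
    # mean field value at the transformed points goes to zero.
    import numpy as np
    from scipy.interpolate import RegularGridInterpolator
    from scipy.optimize import minimize
    from scipy.spatial.transform import Rotation

    # Distance field of part A: a unit sphere centered at the origin.
    axis = np.linspace(-2, 2, 64)
    X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
    field_A = np.abs(np.sqrt(X**2 + Y**2 + Z**2) - 1.0)        # |distance to surface|
    interp = RegularGridInterpolator((axis, axis, axis), field_A,
                                     bounds_error=False, fill_value=2.0)

    # Part B: points on a spherical cap, displaced away from part A.
    rng = np.random.default_rng(1)
    dirs = rng.normal(size=(200, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    cap = dirs[dirs[:, 2] > 0.5] + np.array([0.4, -0.3, 0.2])  # misaligned cap

    def energy(params):
        """Mean field value at transformed points; params = (rotvec, translation)."""
        R = Rotation.from_rotvec(params[:3]).as_matrix()
        pts = cap @ R.T + params[3:]
        return interp(pts).mean()

    res = minimize(energy, x0=np.zeros(6), method="Nelder-Mead",
                   options={"maxiter": 2000, "xatol": 1e-4, "fatol": 1e-6})
    print("residual field energy:", res.fun)   # near zero once the cap snaps onto A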